Parallelism and efficiency of Yambo

Various technical topics, such as parallelism and efficiency, netCDF problems, and the structure of the Yambo code itself, are discussed here.

Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan, Nicola Spallanzani

matdisor
Posts: 12
Joined: Wed Apr 10, 2013 2:40 pm

Parallelism and efficiency of Yambo

Post by matdisor » Wed Apr 10, 2013 3:35 pm

Hi,

First post here, and I'm a newbie to Yambo. I wonder if anyone has benchmarks of the efficiency of Yambo vs. the number of CPUs.
I have tested it on a Cray (using libsci) and found that the scaling is not that good.

For example, for testing purposes, I ran optics_bse_bss for a MoS2 single layer (15 x 15 x 1 k-mesh, 16 bands, MaxGvecs= 20000, NGsBlkXs= 20). Here is what I got:
1 CPU    01h-25m-44s
12 CPUs  16m-20s
24 CPUs  13m-50s
48 CPUs  13m-25s
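
Each run was launched along these lines (a minimal sketch; on the Cray I actually go through the batch system and its aprun launcher, and the input/job names here are placeholders):

Code: Select all

# loop over MPI task counts, tagging each run with its own job string
for n in 1 12 24 48; do
  mpirun -np $n yambo -F yambo.in -J scaling_$n
done
The walltimes above are taken from each run's report file.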

Any pointers on how to get good scaling would be much appreciated, since I need to treat clusters of a few tens to a few hundred atoms.
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle

Daniele Varsano
Posts: 3816
Joined: Tue Mar 17, 2009 2:23 pm

Re: Parallelism and efficiency of Yambo

Post by Daniele Varsano » Wed Apr 10, 2013 3:53 pm

Dear Duy Le,
are you performing full diagonalization in these tests:

Code: Select all

BSSmod= "d"
or using the Haydock algorithm? In the latter case the scaling should be better. In any case, we are working on a totally new parallelization strategy to let Yambo run on BlueGene machines (thousands of CPUs). These MPI/OpenMP features will be available in one of the next releases.
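
For reference, the Haydock solver is selected with:

Code: Select all

BSSmod= "h"                  # [BSS] Solvers `h/d/i/t`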
Best,

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

matdisor
Posts: 12
Joined: Wed Apr 10, 2013 2:40 pm

Re: Parallelism and efficiency of Yambo

Post by matdisor » Wed Apr 10, 2013 4:16 pm

I used the Haydock method. The issue does not seem to be related to the linked libraries, as I tried many different ways of linking: I got some (small) improvement in walltime, but not in the scaling factor.

Looking forward to the new release. I am still curious whether anyone has benchmarks for the scaling factor.

Best,
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle

Conor Hogan
Posts: 111
Joined: Tue Mar 17, 2009 12:17 pm

Re: Parallelism and efficiency of Yambo

Post by Conor Hogan » Wed Apr 10, 2013 5:12 pm

Dear Duy Le,
Thanks for your comments, and welcome to Yambo. I'm a little surprised by your reported scaling, as we found that the Haydock solver scales pretty well - I attach results obtained on MareNostrum in Barcelona.
Can you post the first few lines of your config.log and your whole input file (yambo.in or whatever), and tell us the size of the BSE kernel?
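Something like this is enough to grab the relevant part (a trivial sketch):

Code: Select all

head -n 30 config.log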
Regards,
Conor
Dr. Conor Hogan
CNR-ISM, via Fosso del Cavaliere, 00133 Roma, Italy;
Department of Physics and European Theoretical Spectroscopy Facility (ETSF),
University of Rome "Tor Vergata".

matdisor
Posts: 12
Joined: Wed Apr 10, 2013 2:40 pm

Re: Parallelism and efficiency of Yambo

Post by matdisor » Wed Apr 10, 2013 5:39 pm

Conor Hogan wrote:Dear Duy Le,
Thanks for your comments, and welcome to Yambo. I'm a little surprised by your reported scaling, as we found that the Haydock solver scales pretty well - I attach results obtained on MareNostrum in Barcelona.
Can you post the first few lines of your config.log and your whole input file (yambo.in or whatever), and tell us the size of the BSE kernel?
Regards,
Conor
Thanks Conor,

I was surprised too. I would like to see benchmarks from elsewhere to confirm whether I did something incorrectly. The following is the end of the configure output:

Code: Select all

#
# [VER] 3.3.0 r.1887
#
# [SYS] linux@x86_64
# [SRC] /lustre/scratch/proj/dle-proj/yambo-3.3.0-rev36_lib4
# [BIN] /lustre/scratch/proj/dle-proj/yambo-3.3.0-rev36_lib4/bin
# [FFT]
#
# [ ] Double precision
# [X] Redundant compilation
# [X] MPI
# [X] PW (5.0) support
# [ ] ETSF I/O support
# [X] SCALAPACK
# [X  ] NETCDF/HDF5/Large Files
# [   ] Built-in BLAS/LAPACK/LOCAL
#
# [ CPP ] gcc -E -P
# [  C  ] gcc -g -O2 -D_C_US -D_FORTRAN_US
# [MPICC] mpicc -g -O2 -D_C_US -D_FORTRAN_US
# [ F90 ] pgf90 -O2 -fast -Munroll -Mnoframe -Mdalign -Mbackslash
# [MPIF ] mpif90 -O2 -fast -Munroll -Mnoframe -Mdalign -Mbackslash
# [ F77 ] pgf90 -O2 -fast -Munroll -Mnoframe -Mdalign -Mbackslash
# [Cmain] -Mnomain
# [NoOpt] -O0 -Mbackslash
#
# [ MAKE ] make
# [EDITOR] vim
#
Three input files:
01.in

Code: Select all

setup                        # [R INI] Initialization
MaxGvecs=  20000          RL 
02.in

Code: Select all

em1s                         # [R Xs] Static Inverse Dielectric Matrix
% QpntsRXs
   1 |  64 |                 # [Xs] Transferred momenta
%
% BndsRnXs
  1 | 16 |                   # [Xs] Polarization function bands
%
NGsBlkXs= 20            RL    # [Xs] Response block size
% LongDrXs
 1.000000 | 0.000000 | 0.000000 |        # [Xs] [cc] Electric Field
%
03.in

Code: Select all

optics                       # [R OPT] Optics
bse                          # [R BSK] Bethe Salpeter Equation.
bss                          # [R BSS] Bethe Salpeter Equation solver
BSresKmod= "xc"              # [BSK] Resonant Kernel mode. (`x`;`c`;`d`)
BScplKmod= "none"            # [BSK] Coupling Kernel mode. (`x`;`c`;`d`)
% BSEBands
  1 | 16 |                   # [BSK] Bands range
%
BSENGBlk= 21           RL    # [BSK] Screened interaction block size
BSENGexx= 19993        RL    # [BSK] Exchange components
BSSmod= "h"                  # [BSS] Solvers `h/d/i/t`
% BEnRange
  0.00000 | 5.00000 | eV    # [BSS] Energy range
%
% BDmRange
  0.10000 |  0.10000 | eV    # [BSS] Damping range
%
BEnSteps= 300                # [BSS] Energy steps
% BLongDir
 1.000000 | 0.000000 | 0.000000 |        # [BSS] [cc] Electric Field
%
Attached are config.log and the optics_bse_bss report. The size of the BSE kernel is 14175.
config.log
r_optics_bse_bss.log
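(For scale: if I compute correctly, a kernel of rank 14175 stored as a dense single-precision complex matrix already takes about 14175² × 8 bytes ≈ 1.6 GB; this build is not double precision.)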
You may notice some odd configure options; I had to use them to build with ftn and libsci.

Apologies for any nonsense; I am new to both the theory and the code.

Duy
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle

Davide Sangalli
Posts: 614
Joined: Tue May 29, 2012 4:49 pm
Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy

Re: Parallelism and efficiency of Yambo

Post by Davide Sangalli » Fri Apr 12, 2013 10:44 am

Hi again Duy,
I'm not completely sure what is going on in your compilation - can you also upload the config/setup file? For instance, I'm not 100% sure which BLAS/LAPACK/SCALAPACK libraries you are linking against in the end.
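If it helps, something along these lines should show what was actually linked (a minimal sketch, assuming a standard autoconf build tree; the ldd check is only informative for dynamically linked executables, and Cray builds are often static):

Code: Select all

# libraries recorded at configure time
grep -iE 'blas|lapack|scalapack' config.log | head

# libraries resolved in the final executable (dynamic linking only)
ldd bin/yambo | grep -iE 'sci|blas|lapack|scalapack'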
Thanks
Conor (using Davide's account!)
P.S. We are all away at a school these days, so we cannot really help until we return to work.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/

matdisor
Posts: 12
Joined: Wed Apr 10, 2013 2:40 pm

Re: Parallelism and efficiency of Yambo

Post by matdisor » Mon Apr 29, 2013 4:39 am

Davide Sangalli wrote:Hi again Duy,
I'm not completely sure what is going on in your compilation - can you also upload the config/setup file? For instance, I'm not 100% sure which BLAS/LAPACK/SCALAPACK libraries you are linking against in the end.
Thanks
Conor (using Davide's account!)
P.S. We are all away at a school these days, so we cannot really help until we return to work.
Sorry for the late reply; I was quite busy. As you can see, I use libsci, which is optimized for Cray. I also compiled with other libraries but did not see much of an effect.
I will investigate this issue more closely later; for now I lack the time. Since I am currently using Yambo for small systems, it does not hurt that much.
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle
