
yambo cannot run in parallel

Posted: Wed Aug 27, 2014 9:39 am
by zsjan
Dear all:

I have installed Yambo with the following configuration script:
./configure CC=icc FC=ifort \
  --with-netcdf-include=/home/zhao/yambo-3.4.1/netCDF/install-f/include \
  --with-netcdf-lib=/home/zhao/yambo-3.4.1/netCDF/install-f/lib \
  --with-iotk=/home/zhao/qe/espresso-5.0.2/iotk/ \
  --with-p2y=5.0
which produced the following configuration report:

#
# [VER] 3.4.1 r.3187
#
# [SYS] linux@x86_64
# [SRC] /home/zhao/yambo/yambo-3.4.1
# [BIN] /home/zhao/yambo/yambo-3.4.1/bin
# [FFT] Goedecker Fast Fourier transform with 0 cache
#
# [ ] Double precision
# [X] Redundant compilation
# [X] MPI
# [ ] OpenMP
# [X] PW (5.0) support
# [ ] ETSF I/O support
# [ ] SCALAPACK
# [ ] NETCDF/HDF5/Large Files
# [XX ] Built-in BLAS/LAPACK/LOCAL
#
# [ CPP ] icc -E -ansi
# [ C ] icc -g -O2 -D_C_US -D_FORTRAN_US
# [MPICC] mpicc -g -O2 -D_C_US -D_FORTRAN_US
# [ F90 ] ifort -assume bscc -O3 -ip -xHost
# [MPIF ] mpif90 -assume bscc -O3 -ip -xHost
# [ F77 ] ifort -assume bscc -O3 -ip -xHost
# [Cmain] -nofor_main
# [NoOpt] -assume bscc -O0 -xHost
#
# [ MAKE ] make
# [EDITOR] vim
#

The installed yambo runs successfully in serial with the plain command:
yambo


However, parallel runs always get stuck, without any output, after the command:
mpirun -np 6 yambo

I have tried two platforms with different Open MPI versions and Intel compilers, with the same result. I don't think the MPI version is causing the failure, as was discussed elsewhere in the forum. Could you give me some suggestions?



Shijun Zhao
Peking University

Re: yambo cannot run in parallel

Posted: Wed Aug 27, 2014 9:55 am
by myrta gruning
Dear Shijun Zhao

What exactly do you mean by 'runs always stuck'? Do you get any particular error? Is it running but doing nothing? Crashing? If the latter, is there any message?
What is the input of the job you are trying to run?

About the output: when running in parallel you do not get the output on your screen, but you should get (at least) log and report files. Is this the case?

Best
m
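
A quick way to answer the question about log and report files is to list them in the run directory after the parallel job has been started. The helper below is only a sketch: the function name list_yambo_outputs is hypothetical, and the file-name pattern (names starting with l_ or l- for logs and r_ or r- for reports) is an assumption that may need adjusting for your yambo version.

```shell
# Hypothetical helper: list yambo log and report files in a run directory.
# ASSUMPTION: logs start with "l_" or "l-", reports with "r_" or "r-".
list_yambo_outputs() {
    dir="${1:-.}"   # directory to inspect, defaults to the current one
    find "$dir" -maxdepth 1 -type f \( -name 'l[-_]*' -o -name 'r[-_]*' \) | sort
}
```

Calling list_yambo_outputs . in the directory where mpirun was launched shows whether any log or report file was written; the last lines of each log (for example with tail) show where each task stopped.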

Re: yambo cannot run in parallel

Posted: Tue Sep 16, 2014 3:54 am
by zsjan
Sorry for the late response. I generated the input file with yambo -x, which gave the following:

HF_and_locXC # [R XX] Hartree-Fock Self-energy and Vxc
EXXRLvcs= 153 RL # [XX] Exchange RL components
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 8| 1|700|
%
%QPerange # [GW] QP generalized Kpoint/Energy indices
1| 8| 0.0|-1.0|
%

The log file then ended at:
<03s> [FFT-HF/Rho] Mesh size: 36 36 36
<03s> [WF-HF/Rho loader] Wfs (re)loading | | [000%] --(E) --(X)

and the error message is:
yhrun: job xx queued and waiting for resources
yhrun: job xx has been allocated resources
yhrun: error: cn17: task 13: Killed
yhrun: First task exited 60s ago
yhrun: tasks 1-8,10-11,14-23: running
yhrun: tasks 0,9,12-13: exited abnormally
yhrun: Terminating job step 580464.0
slurmd[cn17]: *** STEP xx KILLED AT 2014-09-16T10:59:16 WITH SIGNAL 9 ***

The run completes normally if I just use the serial yambo command. So what could be the problem?

Re: yambo cannot run in parallel

Posted: Tue Sep 16, 2014 9:04 am
by Daniele Varsano
Dear Shijun,
please remember to fill in your signature with your affiliation; you can do this once and for all by adding it to your profile.
The code starts, so this does not look like a compilation problem, but with the little information you have provided it is impossible to pinpoint the cause.
It could be a memory issue, or a problem with your launch script or your machine. Is there any report file? What happens if you launch a parallel job by hand,
e.g. mpirun -np 4 yambo ...
or a similar command, depending on your MPI libraries?
Alternatively, you could ask your system administrator.

Best,
Daniele
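
To get a rough feel for the memory hypothesis, one can estimate how much RAM each MPI task gets when several tasks share a node. The sketch below is only illustrative: the function name mem_per_task_mb is hypothetical, the 64 GB node size is a made-up number, and the thread does not say how the 24 tasks (0 to 23 in the yhrun output) were distributed over nodes.

```shell
# Hypothetical helper: RAM available per MPI task, in MB, when
# ntasks tasks share one node with total_mb of memory (integer division).
mem_per_task_mb() {
    total_mb="$1"   # total node memory in MB (assumed value, check your machine)
    ntasks="$2"     # number of MPI tasks placed on that node
    echo $(( total_mb / ntasks ))
}

# Made-up example: a 64 GB node shared by 24 tasks versus 4 tasks.
mem_per_task_mb 65536 24   # prints 2730
mem_per_task_mb 65536 4    # prints 16384
```

If the serial run needs more memory than the per-task figure, launching fewer tasks per node (as in the mpirun -np 4 test suggested above) is a simple way to check whether the job is being killed for exceeding memory.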