Yambo_version breakdown after running several seconds

You can post here problems arising when running Yambo on GPU machines: issues with compilation, submission scripting, performance, and other technical aspects related to GPU support.

Moderators: Davide Sangalli, andrea marini, Daniele Varsano, andrea.ferretti, myrta gruning, Conor Hogan

young
Posts: 22
Joined: Wed Apr 03, 2019 3:50 pm

Yambo_version breakdown after running several seconds

Post by young » Thu May 05, 2022 11:19 am

Dear developers,
I have installed the GPU version of Yambo-5.0.4 on an IBM Power 9 machine with nvfortran 21.7; however, the program crashes after running for several seconds.
The error message is as follows:
#####################
0: copyout Memcpy (host=0x2ca0e9c0, dev=0x7ffed59ef200, size=7064) FAILED: 700(an illegal memory access was encountered)
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[39606,1],2]
Exit code: 127
############################################

I have attached the install file and the calculation files.
###############################################
./configure CPP="gcc -E -P" FPP="gfortran -E -P -cpp" --enable-open-mp \
    FC=nvfortran CC=nvc F77=nvfortran MPICC=mpicc \
    --with-mpi-libs=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/lib \
    --with-mpi-path=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi \
    --with-mpi-libdir=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/lib \
    --with-mpi-includedir=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/include \
    MPIFC=mpif90 MPIF77=mpif90 \
    --enable-cuda=cuda11.4,cc60,cc70,cc80 \
    --enable-msgs-comps --enable-time-profile --enable-memory-profile
###########################################


The error seems to come from mpirun, but I can run the GPU versions of QE and VASP built with this nvfortran compiler. Could you help me solve this problem? Thanks in advance.

Best
Ke Yang
PHD student
The Hongkong Polytechnic university, HK, China
Ke Yang
PostDoc
The Hongkong Polytechnic university, HK, China

Nicola Spallanzani
Posts: 23
Joined: Thu Nov 21, 2019 10:15 am

Re: Yambo_version breakdown after running several seconds

Post by Nicola Spallanzani » Fri May 06, 2022 12:59 pm

Dear Ke Yang,
In order to help you I need some additional information about the system you are using:
1) What model of GPU is mounted on the compute nodes?
2) Are 6 GPUs mounted in every node?
3) Are other MPI installations available on the cluster? In particular, IBM's Spectrum_MPI could be very useful.
4) Could you also send the file named r-GW_...?

Best regards,
Nicola
Nicola Spallanzani, PhD
S3 Centre, Istituto Nanoscienze CNR and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu

young
Posts: 22
Joined: Wed Apr 03, 2019 3:50 pm

Re: Yambo_version breakdown after running several seconds

Post by young » Fri May 06, 2022 3:44 pm

Dear Nicola,
Thanks for your response.

1) What model of GPU is mounted on the compute nodes?
In this cluster, each node includes 6 NVIDIA Tesla V100 GPUs with 32 GiB of memory each.

2) Are 6 GPUs mounted in every node?
Yes, each node includes 6 GPUs.

3) Are other MPI installations available on the cluster? In particular, IBM's Spectrum_MPI could be very useful.
Yes, IBM's Spectrum_MPI is also installed on this cluster. However, I did not use Spectrum_MPI to compile Yambo; I disabled Spectrum_MPI and installed Yambo with nvidia_hpc_sdk 21.7. Could you tell me which MPI is better for Yambo? I also ran into some errors when compiling Yambo with Spectrum_MPI.

Thanks very much. I look forward to your response.

Best Regards
Ke Yang
PHD student
The Hongkong Polytechnic university, HK, China

Nicola Spallanzani
Posts: 23
Joined: Thu Nov 21, 2019 10:15 am

Re: Yambo_version breakdown after running several seconds

Post by Nicola Spallanzani » Fri May 06, 2022 9:59 pm

Dear Ke Yang,
I suppose that spectrum-mpi is available via a module. Load the module and check whether the wrapper mpipgifort is available.
If it is, you can use this configuration:

Code: Select all

export FC=pgf90 
export F77=pgfortran
export CPP='cpp -E' 
export CC=pgcc 
export FPP="pgfortran -Mpreprocess -E"
export F90SUFFIX=".f90"
export MPIFC=mpipgifort
export MPIF77=mpipgifort
export MPICC=mpipgicc
export MPICXX=mpipgic++

./configure \
        --enable-cuda=cuda11.4,cc70 \
        --enable-mpi --enable-open-mp \
        --enable-msgs-comps \
        --enable-time-profile \
        --enable-memory-profile \
        --enable-par-linalg \
        --with-blas-libs="-lblas" \
        --with-lapack-libs="-llapack" \
        --with-extlibs-path=$HOME/yambo-extlibs

make -j4 core
The only compute capability you need is cc70, given the model of GPU mounted in the cluster.
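The V100 has compute capability 7.0, which is where cc70 comes from. As a minimal sketch, the value can be derived on a compute node instead of being listed by hand. The helper name cc_flag is hypothetical, and the commented nvidia-smi call assumes a driver recent enough to support the compute_cap query field.

```shell
# Hypothetical helper: turn a compute capability such as "7.0" into the
# "cc70" form used in --enable-cuda=cuda11.4,cc70.
cc_flag() {
    # Strip the dot (and any stray spaces), then prefix with "cc".
    printf 'cc%s\n' "$(printf '%s' "$1" | tr -d '. ')"
}

# Usage on a GPU node (assumes nvidia-smi supports the compute_cap field):
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader | sort -u |
#       while read -r cap; do cc_flag "$cap"; done
```

On a node with only V100 cards this should yield a single cc70 entry, so cc60 and cc80 can be dropped from the configure line.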
Let me know if this configuration works.
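The wrapper check mentioned above can be sketched as follows, to be run after loading the spectrum-mpi module. The function name check_wrappers is hypothetical; it only reports whether each name resolves on PATH and does not test that the wrapper actually compiles anything.

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 1 if at least one of them is missing, 0 otherwise.
check_wrappers() {
    missing=0
    for w in "$@"; do
        if path=$(command -v "$w" 2>/dev/null); then
            echo "found   $w -> $path"
        else
            echo "missing $w"
            missing=1
        fi
    done
    return $missing
}

# Usage, with the wrapper names from the configuration above:
#   check_wrappers mpipgifort mpipgicc mpipgic++
```

If any wrapper is reported missing, the MPIFC, MPIF77, MPICC and MPICXX settings need to point to wrappers that do exist on the cluster.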

Best regards,
Nicola

young
Posts: 22
Joined: Wed Apr 03, 2019 3:50 pm

Re: Yambo_version breakdown after running several seconds

Post by young » Sat May 07, 2022 10:29 am

Dear Nicola,
Thanks very much. I followed your instructions; however, it does not work. It seems the cluster can install the PGI compiler; I changed PGI to gfortran and the configure step then runs, but it reports that only the serial version of Yambo will be installed:
#####################
./configure --enable-cuda=cuda11.4,cc70 --enable-mpi --enable-open-mp --enable-msgs-comps --enable-time-profile --enable-memory-profile --enable-par-linalg --with-blas-libs="-lblas" --with-lapack-libs="-llapack" FC=gfortran MPIFC=mpif90 FPP="gfortran -E" F77=gfortran CPP='cpp -E' MPIFC=mpif90 MPIF77=mpif9
checking for mpixlc... mpixlc
checking for a working mpi.h... no
configure: WARNING: could not compile a C mpi test program. YAMBO serial only.
checking driver lib... @ version 1.0.0

####################################
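The configure output above falls back to a serial build because the "working mpi.h" check failed with the wrapper it picked (mpixlc). A minimal sketch of reproducing that kind of check by hand is given below. The function name check_mpi_cc is hypothetical, and this is not Yambo's own configure test, only a comparable one.

```shell
# Hypothetical helper: try to compile a trivial MPI C program with the
# given wrapper. Returns 0 on success, 1 otherwise.
check_mpi_cc() {
    wrapper="$1"
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not found in PATH"
        return 1
    fi
    tmpdir=$(mktemp -d) || return 1
    # Smallest program that needs both mpi.h and the MPI library.
    cat > "$tmpdir/mpi_test.c" <<'EOF'
#include <mpi.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}
EOF
    if "$wrapper" "$tmpdir/mpi_test.c" -o "$tmpdir/mpi_test" >"$tmpdir/log" 2>&1; then
        echo "$wrapper: OK"
        status=0
    else
        echo "$wrapper: failed to compile an MPI test program"
        cat "$tmpdir/log"   # show the compiler's own error message
        status=1
    fi
    rm -rf "$tmpdir"
    return $status
}

# Usage: check_mpi_cc mpixlc ; check_mpi_cc mpicc
```

A wrapper that passes this check can then be passed to configure explicitly as MPICC, as was done in the first configure command of this thread.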
When running make, I also get these errors:
############################
f951: Warning: Nonexistent include directory ‘/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/’ [-Wmissing-include-dirs]
f951: Warning: Nonexistent include directory ‘/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/’ [-Wmissing-include-dirs]
(gfortran -c -O3 -g -mtune=native -fopenmp -Mcuda=cuda11.4,cc70 -Mcudalib=cufft,cublas,cusolver -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4//include//modules__HDF5_IO_OPENMP_TIMING_CUDA -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/headers/common -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/headers/parser -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/driver -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/driver kind.f90) > /dev/null
gfortran: error: unrecognized command line option ‘-Mcuda=cuda11.4,cc70’
gfortran: error: unrecognized command line option ‘-Mcudalib=cufft,cublas,cusolver’
make[1]: *** [Makefile:236: kind.o] Error 1
make[1]: Leaving directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/qe_pseudo'
make: *** [config/mk/actions/compile_yambo.mk:2: yambo] Error 2

############################
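The two "unrecognized command line option" errors above arise because -Mcuda and -Mcudalib are NVIDIA HPC SDK (formerly PGI) compiler options, which gfortran does not accept. A minimal sketch of a check to run before passing --enable-cuda is given below. The function name is_nv_fortran is hypothetical, and matching on the --version banner is an assumption about how these compilers identify themselves.

```shell
# Hypothetical helper: succeed only if the compiler's --version banner
# looks like an NVIDIA HPC SDK / PGI Fortran compiler.
is_nv_fortran() {
    # Bail out early if the command does not exist at all.
    command -v "$1" >/dev/null 2>&1 || return 1
    "$1" --version 2>&1 | grep -qiE 'nvfortran|pgfortran|pgf90'
}

# Usage:
#   if is_nv_fortran "$FC"; then
#       echo "FC=$FC accepts -Mcuda; --enable-cuda can be used"
#   else
#       echo "FC=$FC is not an NVIDIA compiler; drop --enable-cuda or change FC"
#   fi
```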

I then removed the --enable-cuda option, and the error became:
###############################################
>>>[Making lib/yambo/interface]<<<
make[1]: Entering directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/src/interface'
make[1]: *** No targets specified and no makefile found. Stop.
make[1]: Leaving directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/src/interface'
make: *** [config/mk/actions/compile_yambo.mk:3: yambo] Error 2

#################################

Could I compile with nvfortran instead? Thanks very much!

Ke Yang
The Hongkong Polytechnic university, HK, China