
Yambo version breakdown after running several seconds

Posted: Thu May 05, 2022 11:19 am
by young
Dear developers
I have installed the GPU version of Yambo 5.0.4 on IBM Power 9 with nvfortran 21.7; however, the program breaks down after running for several seconds.
The error message is the following:
#####################
0: copyout Memcpy (host=0x2ca0e9c0, dev=0x7ffed59ef200, size=7064) FAILED: 700(an illegal memory access was encountered)
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[39606,1],2]
Exit code: 127
############################################

The installation (configure) command and calculation files are attached.
###############################################
./configure CPP="gcc -E -P" FPP="gfortran -E -P -cpp" --enable-open-mp FC=nvfortran CC=nvc F77=nvfortran MPICC=mpicc --with-mpi-libs=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/lib --with-mpi-path=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi --with-mpi-libdir=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/lib --with-mpi-includedir=/gpfs/u/home/POPC/POPChnzs/scratch/software/hvhpcInstall/Linux_ppc64le/21.9/comm_libs/mpi/include MPIFC=mpif90 MPIF77=mpif90 --enable-cuda=cuda11.4,cc60,cc70,cc80 --enable-msgs-comps --enable-time-profile --enable-memory-profile
###########################################


The error seems to come from mpirun, but I can run the GPU versions of the QE and VASP codes built with this same nvfortran compiler. Could you help me solve this problem? Thanks in advance.

Best
Ke Yang
PhD student
The Hong Kong Polytechnic University, Hong Kong, China

Re: Yambo version breakdown after running several seconds

Posted: Fri May 06, 2022 12:59 pm
by Nicola Spallanzani
Dear Ke Yang,
in order to help you I need some additional information about the system you are using:
1) which model of GPU is mounted on the compute nodes?
2) are there 6 GPUs mounted in every node?
3) are other MPI installations available on the cluster? In particular, Spectrum MPI by IBM could be very useful.
4) could you also send the file named r-GW_...?
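For reference, most of this information can be gathered directly on a compute node. A minimal sketch, assuming nvidia-smi and an environment-modules setup are available there:

```shell
# List the GPUs visible on the node (answers questions 1 and 2)
nvidia-smi -L

# List the MPI installations provided as modules (answers question 3)
module avail 2>&1 | grep -i mpi
```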

Best regards,
Nicola

Re: Yambo version breakdown after running several seconds

Posted: Fri May 06, 2022 3:44 pm
by young
Dear Nicola,
Thanks for your response.
in order to help you I need some additional information about the system you are using:
1) which model of GPU is mounted on the compute nodes?

In this cluster, each node includes 6 NVIDIA Tesla V100 GPUs with 32 GiB of memory each.

2) are there 6 GPUs mounted in every node?
Yes, each node includes 6 GPUs.

3) are other MPI installations available on the cluster? In particular, Spectrum MPI by IBM could be very useful.
Yes, Spectrum MPI by IBM is also installed on this cluster. However, I did not use Spectrum MPI to compile Yambo; I disabled Spectrum MPI and installed Yambo with the NVIDIA HPC SDK 21.7. Could you tell me which MPI is better for Yambo? I also met some errors when compiling Yambo with Spectrum MPI.

Thank you very much. I look forward to your response.

Best Regards
Ke Yang
PhD student
The Hong Kong Polytechnic University, Hong Kong, China

Re: Yambo version breakdown after running several seconds

Posted: Fri May 06, 2022 9:59 pm
by Nicola Spallanzani
Dear Ke Yang,
I suppose that spectrum-mpi is available via a module. Load the module and check whether the wrapper mpipgifort is available.
If so, you can use this configuration:

Code: Select all

export FC=pgf90
export F77=pgfortran
export CPP='cpp -E'
export CC=pgcc
export FPP="pgfortran -Mpreprocess -E"
export F90SUFFIX=".f90"
export MPIFC=mpipgifort
export MPIF77=mpipgifort
export MPICC=mpipgicc
export MPICXX=mpipgic++

./configure \
    --enable-cuda=cuda11.4,cc70 \
    --enable-mpi --enable-open-mp \
    --enable-msgs-comps \
    --enable-time-profile \
    --enable-memory-profile \
    --enable-par-linalg \
    --with-blas-libs="-lblas" \
    --with-lapack-libs="-llapack" \
    --with-extlibs-path=$HOME/yambo-extlibs

make -j4 core
Considering the GPU model mounted in the cluster, the only compute capability you need is cc70.
Let me know if this configuration works.

Best regards,
Nicola

Re: Yambo version breakdown after running several seconds

Posted: Sat May 07, 2022 10:29 am
by young
Dear Nicola,
Thank you very much. I followed your instructions, but it does not work. It seems the PGI compiler cannot be used on the cluster, so I changed PGI to gfortran. The configure step then runs, but it reports that only the serial version of Yambo will be installed:
#####################
./configure --enable-cuda=cuda11.4,cc70 --enable-mpi --enable-open-mp --enable-msgs-comps --enable-time-profile --enable-memory-profile --enable-par-linalg --with-blas-libs="-lblas" --with-lapack-libs="-llapack" FC=gfortran MPIFC=mpif90 FPP="gfortran -E" F77=gfortran CPP='cpp -E' MPIFC=mpif90 MPIF77=mpif9
checking for mpixlc... mpixlc
checking for a working mpi.h... no
configure: WARNING: could not compile a C mpi test program. YAMBO serial only.
checking driver lib... @ version 1.0.0

####################################
However, when running make, I also get these errors:
############################
f951: Warning: Nonexistent include directory ‘/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/’ [-Wmissing-include-dirs]
f951: Warning: Nonexistent include directory ‘/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/’ [-Wmissing-include-dirs]
(gfortran -c -O3 -g -mtune=native -fopenmp -Mcuda=cuda11.4,cc70 -Mcudalib=cufft,cublas,cusolver -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4//include//modules__HDF5_IO_OPENMP_TIMING_CUDA -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/headers/common -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/headers/parser -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/driver -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/v4/serial/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/external/gfortran/gfortran/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/include/ -I/gpfs/u/home/POPC/POPChnzs/scratch/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/include/driver kind.f90) > /dev/null
gfortran: error: unrecognized command line option ‘-Mcuda=cuda11.4,cc70’
gfortran: error: unrecognized command line option ‘-Mcudalib=cufft,cublas,cusolver’
make[1]: *** [Makefile:236: kind.o] Error 1
make[1]: Leaving directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/qe_pseudo'
make: *** [config/mk/actions/compile_yambo.mk:2: yambo] Error 2

############################

Then I removed the --enable-cuda option, and the error becomes this:
###############################################
>>>[Making lib/yambo/interface]<<<
make[1]: Entering directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/src/interface'
make[1]: *** No targets specified and no makefile found. Stop.
make[1]: Leaving directory '/gpfs/u/scratch/POPC/POPChnzs/software/yambo/gpuwithcuda/spectrumMPI/yambo-5.0.4/lib/yambo/driver/src/interface'
make: *** [config/mk/actions/compile_yambo.mk:3: yambo] Error 2

#################################

Could I compile with nvfortran instead? Thank you very much!

Ke Yang
The Hong Kong Polytechnic University, Hong Kong, China

Re: Yambo version breakdown after running several seconds

Posted: Mon Jul 18, 2022 9:13 am
by Nicola Spallanzani
Dear Ke Yang,
sorry for the late reply. pgfortran is automatically installed with the NVIDIA HPC SDK and is an alias for nvfortran, so using pgfortran is the same as using nvfortran. You have to specify pgfortran anyway because of the spectrum_mpi wrapper mpipgifort.
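Before re-running configure, it can also help to verify by hand that the spectrum_mpi wrappers can compile and run a trivial MPI program. A sketch, assuming the module is loaded and the wrapper names from this thread (mpipgicc) are on the PATH:

```shell
# Minimal MPI smoke test for the spectrum_mpi wrappers
cat > mpi_test.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank = -1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("hello from rank %d\n", rank);
    MPI_Finalize();
    return 0;
}
EOF

mpipgicc mpi_test.c -o mpi_test && mpirun -np 2 ./mpi_test
```

If this fails, configure will also fail its MPI check and fall back to a serial-only build, as in the log above.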

If you are getting errors when compiling with spectrum_mpi, you probably need to run "make distclean" before configure, or even better, restart from scratch in another directory.
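A clean restart along these lines avoids stale results left over from a previous configure run. A sketch, assuming the source tree and tarball names from this thread:

```shell
# Option 1: clean the existing tree before reconfiguring
cd yambo-5.0.4
make distclean

# Option 2 (safer): unpack and build in a fresh directory
cd ..
mkdir -p fresh-build
tar -xzf yambo-5.0.4.tar.gz -C fresh-build
cd fresh-build/yambo-5.0.4
```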

Best regards,
Nicola